An Efficient Density Conscious Subspace Clustering Method using Top-down and Bottom-up Strategies

Author

  • M. Suguna
Abstract

Clustering high-dimensional data is an emerging research field. Most clustering techniques use distance measures to build clusters. In high-dimensional spaces, traditional clustering algorithms suffer from the “curse of dimensionality”. Subspace clustering groups similar objects embedded in subspaces of the full space, and recent approaches attempt to find clusters embedded in subspaces of high-dimensional data. Most previous subspace clustering work discovers subspace clusters by regarding them as regions of higher density: a region is identified as dense if its density exceeds a density threshold. Because cluster densities vary across subspace cardinalities, this approach suffers from the “density divergence problem”. We follow the basic assumptions of the previous work DENCOS. Varying region density thresholds are used to overcome the density divergence problem. All previous approaches are based on the bottom-up method. In this paper, a novel data structure is used that works in both bottom-up and top-down fashion. Performance results show that the method built on this data structure is more efficient than previous work.
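
To make the density divergence problem concrete, here is a minimal Python sketch of a density threshold that changes with subspace cardinality. The grid partition, the function names `unit_counts` and `dense_units`, the parameters `bins` and `alpha`, and the threshold rule (`alpha` times the average unit count at each cardinality) are assumptions chosen for illustration; this is not the data structure or algorithm proposed in the paper.

```python
from collections import Counter
from itertools import combinations


def unit_counts(points, dims, bins):
    """Count points per grid unit in the subspace spanned by `dims`.

    Assumes every coordinate lies in [0, 1).
    """
    return Counter(tuple(int(p[d] * bins) for d in dims) for p in points)


def dense_units(points, max_card, bins=4, alpha=2.0):
    """Return {subspace: {unit: count}} for units that exceed a
    cardinality-dependent density threshold (hypothetical rule)."""
    n, n_dims = len(points), len(points[0])
    result = {}
    for k in range(1, max_card + 1):
        # Assumed rule: alpha times the average unit count at cardinality k.
        # The average shrinks as bins**k grows, so the threshold shrinks too.
        threshold = alpha * n / bins ** k
        for dims in combinations(range(n_dims), k):
            counts = unit_counts(points, dims, bins)
            dense = {u: c for u, c in counts.items() if c > threshold}
            if dense:
                result[dims] = dense
    return result


points = [
    (0.10, 0.10), (0.12, 0.15), (0.20, 0.05), (0.05, 0.20), (0.15, 0.10),
    (0.30, 0.90), (0.60, 0.40), (0.90, 0.70),
]
found = dense_units(points, max_card=2)
```

With 8 points and 4 bins per dimension, the threshold in this sketch is 4 for one-dimensional subspaces and 1 for the two-dimensional subspace, so a unit is judged against the typical occupancy at its own cardinality rather than against one global value.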

Similar resources

Clustering in applications with multiple data sources - A mutual subspace clustering approach

In many applications, such as bioinformatics and cross-market customer relationship management, there are data from multiple sources jointly describing the same set of objects. An important data mining task is to find interesting groups of objects that form clusters in subspaces of the data sources jointly supported by those data sources. In this paper, we study a novel problem of mining mutual...

Detecting Outlying Subspaces for High-Dimensional Data: A Heuristic Search Approach

In this paper, we identify a new task for studying the outlying degree of high-dimensional data, i.e. finding the subspaces (subset of features) in which given points are outliers, and propose a novel detection algorithm, called HighD Outlying subspace Detection (HighDOD). We measure the outlying degree of the point using the sum of distances between this point and its k nearest neighbors. Heur...

Density-Connected Subspace Clustering for High-Dimensional Data

Several application domains such as molecular biology and geography produce a tremendous amount of data which can no longer be managed without the help of efficient and effective data mining methods. One of the primary data mining tasks is clustering. However, traditional clustering algorithms often fail to detect meaningful clusters because most real-world data sets are characterized by a high...

DensEst: Density Estimation for Data Mining in High Dimensional Spaces

Subspace clustering and frequent itemset mining via “step-by-step” algorithms that search the subspace/pattern lattice in a top-down or bottom-up fashion do not scale to large high-dimensional databases. Recent “jump” algorithms directly choose candidate subspace regions or patterns. Their scalability and quality depend heavily on the rating of these candidates as mislead jumps incur poor resul...

Automatische Parameterbestimmung durch Gravitation in Subspace Clustering

Abstract: Compared with traditional clustering methods, subspace clustering makes it possible to search for clusters in the subspaces of the data. Two main types of subspace clustering methods are distinguished: top-down and bottom-up. Top-down algorithms shrink the search regions from high to low dimensions. In the bottom-up approach...

Publication date: 2014